Abstract

In the k-arc connected subgraph problem, we are given a directed graph G and an integer k, and the
goal is to find a subgraph of minimum cost such that there are at least k arc-disjoint paths between any
pair of vertices. We give a simple (1 + 1/k)-approximation for the unweighted variant of the problem,
where all arcs of G have the same cost. This improves on the (1 + 2/k)-approximation of Gabow et
al. [GGTW09].

Similar to the 2-approximation algorithm for this problem [FJ81], our algorithm simply takes the
union of a k in-arborescence and a k out-arborescence. The main difference is in the selection of the
two arborescences. Here, inspired by the recent applications of the rounding by sampling method (see,
e.g., [AGM+10, MOS11, OSS11, AKS12]), we select the arborescences randomly by sampling from a
distribution on unions of k arborescences that is defined based on an extreme point solution of the linear
programming relaxation of the problem. In the analysis, we crucially utilize the sparsity property of the
extreme point solution to upper-bound the size of the union of the sampled arborescences.

To complement the algorithm, we also show that the integrality gap of the minimum cost strongly
connected subgraph problem (i.e., when k = 1) is at least $3/2 - \epsilon$, for any $\epsilon > 0$. Our integrality gap
instance is inspired by the integrality gap example of the asymmetric traveling salesman problem [CGK06],
hence providing further evidence of connections between the approximability of the two problems.

1 Introduction 

In the minimum cost k-arc connected spanning subgraph (min-cost k-ACSS) problem, we are given a directed
graph $G = (V, A)$ with costs $c : A \to \mathbb{R}$ on the arcs and a connectivity requirement k. The goal is to find
a spanning subgraph $G' = (V, A')$ of G of minimum total cost which is k-arc connected, i.e., every pair of
vertices has at least k arc-disjoint paths between them. The special case of k = 1, the 1-ACSS problem, is
called the minimum cost strongly connected subgraph problem. In the unweighted variant of k-ACSS, the
minimum size k-arc connected spanning subgraph (min-size k-ACSS) problem, where all arcs of G have the
same cost, we want to minimize the number of arcs that we choose.

The min-cost k-ACSS problem has a 2-approximation algorithm [FJ81], and it has been a long-standing
open problem to improve this bound. Significant attention has been given to the unweighted variant of the
problem. In particular, the minimum size strongly connected subgraph problem is very well studied [FJ81,
KRY94, KRY96, Vet01, ZNI03], and the current best approximation ratio is 3/2, which is due to Vetta [Vet01].
The min-size k-ACSS problem has been shown to become easier as k increases [CT00, Gab04, GGTW09], and the
best approximation ratio is 1 + 2/k, given in the work of Gabow et al. [GGTW09]. This approximation
ratio is almost tight, as the min-size k-ACSS problem does not admit a $(1 + \epsilon/k)$-approximation, for some
fixed $\epsilon > 0$, unless P=NP [GGTW09]. Similar to the directed case, the minimum size k-edge connected
spanning subgraph problem, an undirected variant of the min-size k-ACSS problem, is known to become easier as
k increases, and the best known approximation ratio for this problem is $1 + 1/(2k) + O(1/k^2)$, due to Gabow
and Gallagher [GG08].

1.1 Our Results 

In this paper, we give improved upper and lower bounds for the k-ACSS problem. We first show the following
improved algorithm for the min-size k-ACSS problem.

Theorem 1. For any $k \ge 1$, there is a $\min\{7/4, 1 + 1/k\}$-approximation algorithm for the min-size k-ACSS
problem.

Similar to the simple 2-approximation algorithm for the minimum-cost k-ACSS problem, our algorithm
takes the union of a k in-arborescence and a k out-arborescence. The main difference is in the selection of the
two arborescences. Here, we select the arborescences randomly by sampling from a distribution on unions
of k arborescences that is defined by the linear programming relaxation of the problem. In particular, we
write a convex combination of the unions of k-arborescences such that the marginal probability of each arc
is bounded above by its fraction in the solution of the LP relaxation.

The algorithm essentially employs the rounding by sampling method that has recently been applied to
various problems in the algorithm design and online optimization literature (cf. [AGM+10, MOS11, OSS11,
AKS12]), while the analysis is much simpler in our setting. Here, the main technical difference is a crucial
use of the extreme point solutions of the LP relaxation. In particular, because of the sparsity of the extreme
point solutions, we can argue that the union of k in-arborescences and k out-arborescences is not much larger
than the size of the support of the LP extreme point solution and thus the size of the optimum.

Our result improves on the (1 + 2/k)-approximation of Gabow et al. [GGTW09] for the min-size k-ACSS
problem, for any k > 0. Furthermore, for the minimum size strongly connected subgraph problem, while we
do not improve the approximation factor of 3/2 [Vet01], our algorithm is much simpler and gives a possible
direction for the weighted version of the problem.

To complement the positive results, we prove that the integrality gap of the natural linear programming
relaxation of the strongly connected subgraph problem is bounded below by $3/2 - \epsilon$ for any $\epsilon > 0$.

Theorem 2. For any $\epsilon > 0$, the integrality gap of the standard linear programming relaxation for the
minimum cost strongly connected subgraph problem is at least $3/2 - \epsilon$.

To the best of our knowledge, there is no explicit construction that gives a lower bound on the integrality 
gap of the minimum cost strongly connected subgraph problem. Our integrality gap example builds on a 
similar construction for the asymmetric traveling salesman problem [CGK06] and shows stronger connections 
between the two problems. 

1.2 Notations 

Let $\delta^+_G(U) := \{(u, v) \in A : u \in U, v \in V \setminus U\}$ denote the set of arcs leaving $U$ in a graph $G$; if $G$ is clear
from the context, we drop the subscript.

A graph $G$ is k-arc connected if and only if every nonempty proper subset of vertices $U \subset V$ has at least $k$ leaving
arcs, i.e., $|\delta^+_G(U)| \ge k$, and $G$ is strongly connected if it is 1-arc connected. We use the following
linear programming relaxation for k-ACSS.



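(LP-ACSS)

\[
\begin{aligned}
\text{minimize} \quad & \sum_{a \in A} c_a x_a \\
\text{subject to} \quad & x(\delta^+(U)) \ge k && \forall\, \emptyset \ne U \subset V, \\
& 0 \le x_a \le 1 && \forall\, a \in A,
\end{aligned}
\]

where $x(\delta^+(U)) = \sum_{a \in \delta^+(U)} x_a$. Throughout the paper, $x$ will always be an optimum solution of
(LP-ACSS).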
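
The cut constraints of (LP-ACSS) can be checked directly on very small instances. The following is a minimal sketch, assuming the fractional solution is stored as a dictionary from arcs to values; the function names `cut_value` and `is_feasible` are hypothetical, and the enumeration over all vertex subsets is exponential, so it is only meant as an illustration rather than as a separation oracle.

```python
from itertools import combinations

def cut_value(x, U):
    """Sum of x over the arcs leaving the vertex set U, i.e. x(delta^+(U))."""
    return sum(val for (u, v), val in x.items() if u in U and v not in U)

def is_feasible(vertices, x, k, tol=1e-9):
    """Check 0 <= x_a <= 1 and x(delta^+(U)) >= k for every nonempty proper U."""
    if any(val < -tol or val > 1 + tol for val in x.values()):
        return False
    vs = list(vertices)
    for size in range(1, len(vs)):          # sizes 1 .. |V|-1: nonempty proper subsets
        for U in combinations(vs, size):
            if cut_value(x, set(U)) < k - tol:
                return False
    return True

# Bidirected triangle with x_a = 1/2 on all six arcs (illustrative instance).
tri = [0, 1, 2]
half = {(u, v): 0.5 for u in tri for v in tri if u != v}
feasible_k1 = is_feasible(tri, half, k=1)   # every cut has value exactly 1
feasible_k2 = is_feasible(tri, half, k=2)   # the same cuts are too small for k = 2
```

On this toy instance every cut is crossed by exactly two arcs of value 1/2, so the point is feasible for k = 1 but not for k = 2.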

For any vector $y : A \to \mathbb{R}$ and a set $F \subseteq A$ of arcs, $y(F) := \sum_{a \in F} y_a$ is the sum of the values of the arcs
in $F$, and $c(F) := \sum_{a \in F} c_a$ is the sum of the costs of the arcs in $F$. Also, $\chi(F)$ denotes the characteristic
vector of the set $F$, i.e., $\chi(F)_a = 1$ if $a \in F$ and $\chi(F)_a = 0$ otherwise.

2 An Approximation Algorithm for Min-Size k-ACSS

In this section, we prove Theorem 1: given a graph G, we give a polynomial time algorithm that finds a k-arc
connected subgraph of G with at most $\min\{1 + 1/k, 7/4\}$ times the number of arcs of the optimum solution.
Before describing the algorithm, we need to recall some of the properties of arborescences in directed graphs.

Given a directed graph G and a (root) vertex $r \in V$, an r-out arborescence T of G is a directed tree
rooted at r that contains a path from r to every other vertex of G. An r-out k-arborescence is a subgraph T
of G that is the union of k arc-disjoint r-out arborescences. An r-in arborescence and an r-in k-arborescence
are defined analogously. The following polyhedron plays an important role in the design and analysis of our
algorithm.

\[ P^{out} = \{ y : y(\delta^+(U)) \ge k, \ \forall\, \emptyset \ne U \subseteq V \setminus \{r\}, \ 0 \le y \le 1 \} \]

Frank [Fra79] showed that $P^{out}$ is the up hull of the convex hull of r-out k-arborescences (see Corollary
53.6a [Sch03]), and it can be seen that every feasible solution of (LP-ACSS) is a point in $P^{out}$. Vempala
and Carr [CV02] gave a polynomial-time algorithm that allows us to write a point $x \in P^{out}$ as a convex
combination of k arc-disjoint arborescences. Their algorithm requires a polynomial-time algorithm for finding
an r-out k-arborescence [Edm73, Gab91].

Lemma 3. [Fra79, CV02, Edm73, Gab91] $P^{out}$ is the convex hull of subsets of A containing r-out k-arborescences.
Moreover, given any fractional solution $y \in P^{out}$, there is a polynomial time algorithm that
finds a convex combination of r-out k-arborescences, $T_1, \ldots, T_l$, with coefficients $\lambda_1, \ldots, \lambda_l \ge 0$ summing to 1, such that

\[ y \ge \sum_{i=1}^{l} \lambda_i \chi(T_i). \]

The above lemma holds analogously for r-in arborescences. Now, since $x \in P^{out}$, we can write
a distribution over r-out (r-in) k-arborescences such that the probability that each arc $a \in A$ is chosen in a random
arborescence is bounded above by $x_a$:

Corollary 4. There are distributions $\mathcal{D}_{in}(r)$ and $\mathcal{D}_{out}(r)$ of r-in k-arborescences and r-out k-arborescences,
such that the marginal value of each arc $a \in A$ is bounded above by $x_a$, i.e., for all arcs $a \in A$,

\[ \mathbb{P}_{T \sim \mathcal{D}_{in}(r)}[a \in T] \le x_a, \qquad \mathbb{P}_{T \sim \mathcal{D}_{out}(r)}[a \in T] \le x_a. \]

Moreover, these distributions can be computed in polynomial time. 
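
To make the notion of marginal values concrete, the sketch below treats a distribution as an explicit list of (weight, arc set) pairs, computes the marginal probability of each arc, and draws one sample. The representation and the names `marginals` and `sample` are assumptions made for illustration; the decomposition algorithm of [CV02] that would actually produce such a list is not implemented here.

```python
import random

def marginals(combo):
    """combo: list of (weight, arc_set) pairs whose weights sum to 1.
    Returns P[a in T] for every arc a that appears in some arc_set."""
    probs = {}
    for weight, arcs in combo:
        for a in arcs:
            probs[a] = probs.get(a, 0.0) + weight
    return probs

def sample(combo, rng):
    """Draw one arc set with probability equal to its weight."""
    weights = [w for w, _ in combo]
    sets = [s for _, s in combo]
    return rng.choices(sets, weights=weights, k=1)[0]

# Two 0-out arborescences on vertices {0, 1, 2}, each with weight 1/2 (toy data).
combo = [
    (0.5, frozenset({(0, 1), (0, 2)})),
    (0.5, frozenset({(0, 1), (1, 2)})),
]
x = {(0, 1): 1.0, (0, 2): 0.5, (1, 2): 0.5}   # hypothetical fractional point
marg = marginals(combo)
dominated = all(marg[a] <= x[a] + 1e-12 for a in marg)
drawn = sample(combo, random.Random(0))
```

Here the marginals coincide with `x`, so the condition of Corollary 4 holds with equality on every arc.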

Now, we are ready to describe our algorithm. We sample k-arborescences $T_{in}$ and $T_{out}$ independently
from $\mathcal{D}_{in}(r)$ and $\mathcal{D}_{out}(r)$, respectively, and we then return $T_{in} \cup T_{out}$ as the output. The details are described in
Algorithm 1.

Next, we show that the approximation ratio of the above algorithm is no more than $\min\{7/4, 1 + 1/k\}$.

Theorem 5. For any directed graph G, Algorithm 1 always produces a k-arc connected subgraph of G such
that the expected size of the solution is no more than $\min\{7/4, 1 + 1/k\}$ times the optimum.
Algorithm 1 Approximation Algorithm for Min-Size k-ACSS

1: Solve (LP-ACSS) to get an optimum extreme point solution $x$.
2: Find distributions $\mathcal{D}_{in}(r)$ and $\mathcal{D}_{out}(r)$ on r-in and r-out k-arborescences, respectively, such that the
   marginal value of each arc $a \in A$ is bounded above by $x_a$.
3: Sample an r-in k-arborescence $T_{in}$ from $\mathcal{D}_{in}(r)$ and an r-out k-arborescence $T_{out}$ from $\mathcal{D}_{out}(r)$,
   independently.
4: return $T_{in} \cup T_{out}$.
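
As a minimal illustration of the last step for k = 1, the sketch below takes the union of one r-in and one r-out arborescence and checks that the result is strongly connected. It is not Algorithm 1 itself: the arborescences are found by plain breadth-first search instead of being sampled from the LP-based distributions, vertices are assumed to be labeled 0..n-1, and all function names are hypothetical.

```python
from collections import deque

def bfs_arborescence(n, arcs, root):
    """Arcs of a BFS out-arborescence rooted at root (covers the reachable vertices)."""
    adj = {v: [] for v in range(n)}
    for u, v in arcs:
        adj[u].append(v)
    seen, tree, queue = {root}, set(), deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                tree.add((u, v))
                queue.append(v)
    return tree

def in_arborescence(n, arcs, root):
    """An in-arborescence is an out-arborescence of the reversed graph, flipped back."""
    rev = [(v, u) for u, v in arcs]
    return {(v, u) for u, v in bfs_arborescence(n, rev, root)}

def strongly_connected(n, arcs):
    """Vertex 0 reaches every vertex and is reached from every vertex."""
    rev = [(v, u) for u, v in arcs]
    return (len(bfs_arborescence(n, arcs, 0)) == n - 1
            and len(bfs_arborescence(n, rev, 0)) == n - 1)

# Bidirected 4-cycle as a toy input graph.
n = 4
arcs = [(i, (i + 1) % n) for i in range(n)] + [((i + 1) % n, i) for i in range(n)]
t_out = bfs_arborescence(n, arcs, 0)
t_in = in_arborescence(n, arcs, 0)
union = t_out | t_in
```

The union has at most 2(n - 1) arcs, which is the source of the factor 2 in the classical analysis; the point of the sampling step is to make the two arborescences overlap more in expectation.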



Proof. First, we show that the union of any pair of r-in and r-out k-arborescences is k-arc connected. Let
$T_{in}$ ($T_{out}$) be an r-in (r-out) k-arborescence, and $H := T_{in} \cup T_{out}$. Since both $T_{in}$ and $T_{out}$ are unions of k
arc-disjoint arborescences, there are k arc-disjoint paths from each of the vertices to r and k arc-disjoint
paths from r to each of the vertices. Therefore, H remains strongly connected after removing any set of
k - 1 arcs. Hence, H is k-arc connected.

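It remains to show that the expected size of the solution is no more than $\min\{1 + 1/k, 7/4\}$ times the
optimum, i.e.,

\[ \frac{\mathbb{E}_{T_{in} \sim \mathcal{D}_{in}(r),\, T_{out} \sim \mathcal{D}_{out}(r)}\left[|T_{in} \cup T_{out}|\right]}{|OPT|} \le \min\left\{\frac{7}{4},\, 1 + \frac{1}{k}\right\}. \]

To simplify the notation, we will skip the subscripts and write $\mathbb{E}[|T_{in} \cup T_{out}|]$ to mean $\mathbb{E}_{T_{in} \sim \mathcal{D}_{in}(r),\, T_{out} \sim \mathcal{D}_{out}(r)}[|T_{in} \cup T_{out}|]$.
Similarly, we will skip the subscripts for $\mathbb{P}_{T_{in} \sim \mathcal{D}_{in}(r)}[a \in T_{in}]$ and $\mathbb{P}_{T_{out} \sim \mathcal{D}_{out}(r)}[a \in T_{out}]$.
Since $T_{in}$ and $T_{out}$ are chosen independently,

\[ \mathbb{E}[|T_{in} \cup T_{out}|] = \sum_{a \in A} \left\{ \mathbb{P}[a \in T_{in}] + \mathbb{P}[a \in T_{out}] - \mathbb{P}[a \in T_{in}] \cdot \mathbb{P}[a \in T_{out}] \right\} \le \sum_{a \in A} \left( 2x_a - x_a^2 \right). \]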



The last inequality follows from Corollary 4 and the fact that $x_a \le 1$ for all $a \in A$. Let $F := \{a :
0 < x_a < 1\}$ be the set of fractional arcs (i.e., the set of arcs with non-integer values in the solution of
(LP-ACSS)). Since $x$ is an optimal solution of (LP-ACSS), $|OPT| \ge \sum_{a \in A} x_a$. Therefore,

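\[ \frac{\mathbb{E}[|T_{in} \cup T_{out}|]}{|OPT|} \le \frac{2\sum_{a \in A} x_a - \sum_{a \in A} x_a^2}{\sum_{a \in A} x_a} = 1 + \frac{x(F) - \sum_{a \in F} x_a^2}{x(A)} \le 1 + \frac{x(F) - x(F)^2/|F|}{x(A)}, \tag{1} \]

where the equality holds because $x_a - x_a^2 = 0$ for every arc with $x_a \in \{0, 1\}$, and the last inequality follows from Jensen's inequality and the fact that $f(t) = -t^2$ is a concave function.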

Since $x$ is an extreme point solution of (LP-ACSS), $x$ is a sparse vector. It follows from the work of
Melkonian and Tardos [MT04] (see also [GGTW09]) that the number of fractional arcs, $|F|$, is no more than
$4n$. Hence,

\[ \frac{x(F) - x(F)^2/|F|}{x(A)} \le \frac{x(F) - x(F)^2/4n}{x(A)} \le \frac{n}{x(A)} \le \frac{1}{k}, \tag{2} \]

where the second inequality follows since $x(F) - x(F)^2/4n$ attains its maximum at $x(F) = 2n$, and the last
inequality follows from the fact that $x(A) = \sum_{v \in V} x(\delta^+(v)) \ge nk$. On the other hand, since $x(F) \le x(A)$,
we get

\[ \frac{x(F) - x(F)^2/|F|}{x(A)} \le \frac{1}{2} + \frac{x(F) - x(F)^2/2n}{2x(A)} \le \frac{1}{2} + \frac{n}{4x(A)} \le \frac{3}{4}. \tag{3} \]

The theorem simply follows by putting equations (1),(2),(3) together. □ 
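
The bounds (1), (2) and (3) can be sanity-checked numerically. The sketch below evaluates the right-hand side of (1) over a grid of values satisfying the constraints used in the proof ($|F| \le 4n$, $0 < x(F) < |F|$, $x(A) \ge nk$ and $x(A) \ge x(F)$) and records the largest value found. The helper names and the grid are assumptions made for illustration; this is a numerical check, not a substitute for the argument above.

```python
def ratio_bound(x_F, size_F, x_A):
    """Right-hand side of (1): 1 + (x(F) - x(F)^2 / |F|) / x(A)."""
    if size_F == 0:
        return 1.0
    return 1.0 + (x_F - x_F ** 2 / size_F) / x_A

def worst_case(n, k, steps=40):
    """Largest value of the bound over a grid of admissible (|F|, x(F), x(A))."""
    worst = 1.0
    for size_F in range(1, 4 * n + 1):        # |F| <= 4n
        for i in range(1, steps):
            x_F = size_F * i / steps          # 0 < x(F) < |F|
            x_A = max(n * k, x_F)             # smallest admissible x(A) is the worst case
            worst = max(worst, ratio_bound(x_F, size_F, x_A))
    return worst

# Compare against min{7/4, 1 + 1/k} for a few values of k (n = 6 is arbitrary).
checks = {k: (worst_case(6, k), min(7 / 4, 1 + 1 / k)) for k in range(1, 6)}
```

For k = 1 the grid contains the point $|F| = 4n$, $x(F) = x(A) = n$, at which the bound equals 7/4, so the constant in (3) cannot be improved by this analysis alone.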
Remark 6. Since the distributions $\mathcal{D}_{in}(r)$ and $\mathcal{D}_{out}(r)$ can be constructed such that the support of each
distribution has size only polynomially large in n, the algorithm can be derandomized simply by choosing a
pair of k-arborescences that have the minimum number of arcs in their union.
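
The derandomization in Remark 6 amounts to a search over pairs from the two supports. A minimal sketch, assuming each support is given as a list of arc sets (the name `best_pair` and the toy supports are hypothetical):

```python
from itertools import product

def best_pair(in_support, out_support):
    """Return the (T_in, T_out) pair whose union has the fewest arcs."""
    return min(product(in_support, out_support),
               key=lambda pair: len(pair[0] | pair[1]))

# Toy supports on vertices {0, 1, 2} with root 0 and k = 1.
in_support = [
    frozenset({(1, 0), (2, 0)}),
    frozenset({(1, 2), (2, 0)}),
]
out_support = [
    frozenset({(0, 1), (1, 2)}),
]
t_in_best, t_out_best = best_pair(in_support, out_support)
best_union = t_in_best | t_out_best
```

Since the expected union size is an average over such pairs, the minimizing pair is at least as good as the expectation bounded in Theorem 5.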

3 A Lower Bound on the Integrality Gap 

In this section, we prove Theorem 2: we show a lower bound of $1.5 - \epsilon$, for any arbitrarily small $\epsilon > 0$, on
the integrality gap of (LP-ACSS) for k = 1. Our construction is based on the LP-gap construction for the
asymmetric traveling salesman problem by Charikar, Goemans and Karloff [CGK06].

3.1 Construction 

Let $r > 0$ be an integral parameter that will be defined later. We start by defining the integrality gap
example, $G(d,s,t)$, by a recursive construction of depth d. In any graph $G(d,s,t)$, d is the depth, r is the
number of columns, and s, t are the source and sink vertices, respectively. We allow s and t to be the same vertex.
We will construct $G(d,s,t)$ inductively such that it contains exactly r copies of $G(d-1,\cdot,\cdot)$.

We start by describing $G(1,s,t)$. The graph consists of s, t and r distinct vertices $v_1, \ldots, v_r$. Let $v_0 = s$
and $v_{r+1} = t$; note that $v_0$ and $v_{r+1}$ may be the same depending on the given parameters s and t. For any
$1 \le i \le r+1$, we include arcs $(v_i, v_{i-1})$ and $(v_{i-1}, v_i)$ in $G(1,s,t)$. Therefore,

\[ A(G(1,s,t)) := \{(v_{i-1}, v_i), (v_i, v_{i-1}) : 1 \le i \le r+1\}. \]

Next, we define $G(d,s,t)$. The graph consists of s, t and r distinct copies of $G(d-1,\cdot,\cdot)$. In particular,
let $v_1, \ldots, v_r, u_1, \ldots, u_r$ be 2r distinct vertices, and $v_0 = u_{r+1} = s$ and $v_{r+1} = u_0 = t$. For any $1 \le i \le r$,
include a distinct copy of $G(d-1,\cdot,\cdot)$ with source $u_i$ and sink $v_i$. Also, for any $1 \le i \le r+1$, include the
arcs $(v_i, v_{i-1})$ and $(u_{i-1}, u_i)$. Therefore,

\[ A(G(d,s,t)) := \{(u_{i-1}, u_i), (v_i, v_{i-1}) : 1 \le i \le r+1\} \cup \left( \bigcup_{i=1}^{r} A(G(d-1, u_i, v_i)) \right). \]

Figure 1 illustrates the graph $G(3,s,s)$ for r = 3.

Our integrality gap example is $G_d := G(d,s,s)$, where the source and the sink are unified. The $i$-th column
of $G_d$ is defined to be the $i$-th copy of $G(d-1,\cdot,\cdot)$, i.e., $G_d^{(i)} := G(d-1, u_i, v_i)$. The arcs that
connect the r columns with s and t, i.e., $A(G_d) \setminus \bigcup_{i=1}^{r} A(G_d^{(i)})$, are called the $d$-th level arcs. Similarly, the
$l$-th level arcs of $G_d$ are defined to be the set of arcs included at the $l$-th level of the induction. For example, the
$(d-1)$-th level arcs of $G_d$ are $\bigcup_{i=1}^{r} \left( A(G_d^{(i)}) \setminus \bigcup_{j=1}^{r} A(G_d^{(i,j)}) \right)$, where $G_d^{(i,j)}$ is the $j$-th column of $G_d^{(i)}$.

We define the costs of the arcs of $G_d$ such that, for any $1 \le l \le d$, the total cost of the arcs at level $l$ is
equal to 1. In other words, the cost of each arc at level $l$, $c_d(l)$, is the reciprocal of the number of arcs at
level $l$. By the construction of $G_d$, we have

\[ c_d(l) := \frac{1}{2(r+1)\, r^{d-l}}. \tag{4} \]
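
The arc counts behind the cost formula (4) can be confirmed by building the instance explicitly. The sketch below follows the recursive definition as stated in the text (including the indexing $v_0 = u_{r+1} = s$, $v_{r+1} = u_0 = t$), tags each arc with the level at which it is added, and counts arcs per level. The function names and the integer vertex labels are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import count

def build(d, s, t, r, fresh, arcs):
    """Append (tail, head, level) triples for G(d, s, t) to the list arcs."""
    v = [s] + [next(fresh) for _ in range(r)] + [t]       # v_0 = s, v_{r+1} = t
    if d == 1:
        for i in range(1, r + 2):
            arcs.append((v[i - 1], v[i], 1))
            arcs.append((v[i], v[i - 1], 1))
        return
    u = [t] + [next(fresh) for _ in range(r)] + [s]       # u_0 = t, u_{r+1} = s
    for i in range(1, r + 2):
        arcs.append((v[i], v[i - 1], d))
        arcs.append((u[i - 1], u[i], d))
    for i in range(1, r + 1):                             # i-th column: G(d-1, u_i, v_i)
        build(d - 1, u[i], v[i], r, fresh, arcs)

def level_counts(d, r):
    """Number of arcs of G(d, s, s) at each level; vertex 0 plays the role of s = t."""
    arcs = []
    build(d, 0, 0, r, count(1), arcs)
    return Counter(level for _, _, level in arcs)

def arc_cost(d, l, r):
    """c_d(l) as in equation (4)."""
    return Fraction(1, 2 * (r + 1) * r ** (d - l))

counts = level_counts(3, 3)
total_cost = sum(counts[l] * arc_cost(3, l, 3) for l in counts)
```

For d = 3 and r = 3 each level carries total cost 1, so the total cost of all arcs is d, which is the quantity used in the LP upper bound of the next subsection.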

3.2 Lower Bounding the Integrality Gap 

We show that for any $d > 0$, and for a sufficiently large r, the integrality gap of the instance $G(d,s,s)$ is at
least $3/2 - O(1/d)$.

Theorem 7. For any $d > 0$ and $r > d$, the integrality gap of the instance $G(d,s,s)$ is at least $3/2 - 8/d$.

Figure 1: An illustration of the graph G(3, s, s), for r = 3. Note that the vertices labeled "s" on the left and
on the right are the same.



First, we show that the optimal value of the LP is at most $d/2$. Define $x^*_a := 1/2$ for all arcs $a \in A(G_d)$.
Charikar et al. [CGK06] show that $x^*$ belongs to the Held-Karp relaxation polytope [HK70]. Since any
solution of the Held-Karp relaxation polytope is a feasible solution to (LP-ACSS) for k = 1, $x^*$ is also a
feasible solution to (LP-ACSS). Furthermore, since the sum of the costs of the arcs of $G_d$ is d, i.e., $c(A(G_d)) = d$,
we have $\sum_a c(a) x^*_a = d/2$. Hence, the optimal value of the LP is at most $d/2$.

Lemma 8 (Charikar et al. [CGK06]). For k = 1, the optimum value of (LP-ACSS) for the graph $G_d$ is at
most $d/2$.

For any $d > 0$, let $H_d$ be the minimum cost strongly connected subgraph of $G_d$, and $T(d) := c(A(H_d))$
be the cost of $H_d$. In the rest of the section, we prove the following lemma:

Lemma 9. For all $d > 0$,

\[ T(d) \ge \frac{3d-1}{4} - \frac{3d}{r}. \tag{5} \]

Let $H_d^{(i)} := H_d \cap G_d^{(i)}$ be the $i$-th column of $H_d$. Observe that $H_d^{(i)}$ can be incident to (at most) four
of the $d$-th level arcs of $H_d$. Let

\[ A_d(i) := \{(v_i, v_{i-1}), (v_{i+1}, v_i), (u_{i-1}, u_i), (u_i, u_{i+1})\} \cap A(H_d) \]

be the set of those arcs. We can lower-bound $c(A(H_d^{(i)}))$ based on the number of arcs incident to $H_d^{(i)}$
(note that since $H_d$ is strongly connected, $|A_d(i)| \ge 2$):

Case 1: $|A_d(i)| \ge 3$

In this case, we must have

\[ c(A(H_d^{(i)})) \ge T(d-1)/r. \tag{6} \]

The inequality essentially follows from the fact that $H_d^{(i)}$ is a strongly connected subgraph of $G_{d-1}$.
This is because the remaining arcs of the graph, $H_d \setminus H_d^{(i)}$, can only connect (or unify) the source and
sink of $H_d^{(i)}$, i.e., $u_i$ and $v_i$. The $1/r$ factor follows from the fact that the cost of each arc of $G_{d-1}$ is
$r$ times the cost of the corresponding arc in $G_d^{(i)}$.

Case 2: $|A_d(i)| = 2$, and each of $u_i$ and $v_i$ is incident to exactly one arc of $A_d(i)$

Similar to the previous case, here we have

\[ c(A(H_d^{(i)})) \ge T(d-1)/r. \tag{7} \]

As we will see in Lemma 10, at most two columns of $H_d$ may satisfy this case. Therefore, although we
have the worse lower bound on $c(H_d^{(i)})$ in this case, it has an insignificant effect on the final lower bound.



Figure 2: An illustration of $H_d$ where the second column satisfies Case 2. The black arcs represent the arcs
of $H_d$, and grey arcs represent the removed arcs. Observe that every arc at level d is a min-cut of $H_d$.

Case 3: $|A_d(i)| = 2$, and one of $u_i$ or $v_i$ is incident to none of the arcs of $A_d(i)$

Here we obtain a better lower bound. For $1 \le j \le r$, let $H_d^{(i,j)}$ be the $j$-th column of $H_d^{(i)}$ with source
$u_{i,j}$ and sink $v_{i,j}$. It follows that the only $(u_i, v_i)$ (or $(v_i, u_i)$) path in $H_d$ is the one that is made by the $(d-1)$-th
level arcs connecting the columns of $H_d^{(i)}$, i.e., $u_i, u_{i,1}, u_{i,2}, \ldots, u_{i,r}, v_i$ (resp. $v_i, v_{i,r}, v_{i,r-1}, \ldots, v_{i,1}, u_i$).
Therefore, $H_d^{(i)}$ must contain all of the $(d-1)$-th level arcs of $G_d^{(i)}$. Since each column of $H_d^{(i)}$ is incident
to 4 arcs of level $d-1$, by repeated application of Case 1, we obtain

\[ c(A(H_d^{(i)})) \ge 2(r+1)\, c_d(d-1) + \sum_{j=1}^{r} c(A(H_d^{(i,j)})) \ge 2(r+1)\, c_d(d-1) + \frac{T(d-2)}{r}. \tag{8} \]

Next, we show that there are at most 2 columns satisfying the second case.

Lemma 10. At most two columns of $H_d$ satisfy the second case.

Proof. The proof is a simple case analysis argument. First, observe that there exists a column satisfying
the second case in $H_d$ if and only if $(u_{i-1}, u_i), (v_i, v_{i-1}) \notin A(H_d)$ for some $1 \le i \le r+1$. Now, suppose this is
the case. It then follows that $H_d$ must contain all arcs at level d except these two arcs because each of the
other arcs is a min-cut of $H_d$. See Figure 2. Therefore, all except (at most) two of the columns of $H_d$ are
adjacent to exactly 4 arcs at level d. □
Now we are ready to prove Lemma 9. 

Proof of Lemma 9. We prove the claim by induction. First observe that $T(0) = 0$ and $T(1) = 1/2$ satisfy (5). Let
$N_1$, $N_2$, $(r - N_1 - N_2)$ be the number of columns satisfying Case 1, 2, 3, respectively. We divide the cost of
each arc at level d equally between the columns incident to it. This incurs a cost of $3c_d(d)/2$ to the columns
satisfying Case 1, $c_d(d)$ to the rest of the columns and at least $c_d(d)$ to the source vertex s (note that s is
adjacent to at least two arcs at level d). Using equations (6), (7), (8) we get:

\[
\begin{aligned}
T(d) &\ge c_d(d) + \min_{0 \le N_1, N_2 \le r} \Big\{ N_1 \Big( \frac{3c_d(d)}{2} + \frac{T(d-1)}{r} \Big) + N_2 \Big( c_d(d) + \frac{T(d-1)}{r} \Big) \\
&\qquad\qquad + (r - N_1 - N_2) \Big( c_d(d) + 2(r+1)\, c_d(d-1) + \frac{T(d-2)}{r} \Big) \Big\} \\
&\ge \min_{0 \le N \le r} \Big\{ N \Big( \frac{3c_d(d)}{2} + \frac{T(d-1)}{r} \Big) + (r - N) \Big( c_d(d) + 2(r+1)\, c_d(d-1) + \frac{T(d-2)}{r} \Big) \Big\} \\
&\ge \min_{0 \le \alpha \le 1} \Big\{ \alpha \Big( \frac{3r}{4(r+1)} + T(d-1) \Big) + (1 - \alpha) \Big( \frac{3r}{2(r+1)} + T(d-2) \Big) \Big\} \\
&\ge \min\{3/4 + T(d-1),\ 3/2 + T(d-2)\} - 3/r.
\end{aligned}
\]

The second inequality follows from the fact that $N_2 \le 2$. The third inequality follows from equation (4),
and the last one follows from simple algebra.

Now, we may apply the induction hypothesis to $T(d-1)$ and $T(d-2)$. We get

\[
\begin{aligned}
T(d) &\ge \min\Big\{ \frac{3}{4} + \frac{3(d-1)-1}{4} - \frac{3(d-1)}{r},\ \frac{3}{2} + \frac{3(d-2)-1}{4} - \frac{3(d-2)}{r} \Big\} - \frac{3}{r} \\
&\ge \frac{3d-1}{4} - \frac{3d}{r},
\end{aligned}
\]

which completes the proof. □ 
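
The induction can also be checked mechanically. The sketch below iterates the recurrence $L(d) = \min\{3/4 + L(d-1),\ 3/2 + L(d-2)\} - 3/r$ from the base values $L(0) = 0$ and $L(1) = 1/2$ in exact rational arithmetic and compares it with the closed form in (5). The names `recurrence_bound` and `closed_form` are hypothetical, and the check only covers the finite range of d and r that is enumerated.

```python
from fractions import Fraction

def recurrence_bound(d, r):
    """L(d) from L(0) = 0, L(1) = 1/2 and L(d) = min{3/4 + L(d-1), 3/2 + L(d-2)} - 3/r."""
    vals = [Fraction(0), Fraction(1, 2)]
    for _ in range(2, d + 1):
        step = min(Fraction(3, 4) + vals[-1], Fraction(3, 2) + vals[-2])
        vals.append(step - Fraction(3, r))
    return vals[d]

def closed_form(d, r):
    """Right-hand side of (5): (3d - 1)/4 - 3d/r."""
    return Fraction(3 * d - 1, 4) - Fraction(3 * d, r)

# Verify L(d) >= closed form on a finite grid of parameters.
holds = all(recurrence_bound(d, r) >= closed_form(d, r)
            for d in range(0, 21) for r in range(1, 51))
```

Because the recurrence is monotone in $T(d-1)$ and $T(d-2)$, any lower bound on the iterates $L(d)$ is also a lower bound on $T(d)$.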
This completes the proof of Theorem 7. 
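
For completeness, the way Lemma 8 and Lemma 9 combine can be written out as follows. This is a short worked derivation under the assumption that r is at least d, which in particular covers the hypothesis of Theorem 7; it uses only the bounds $T(d) \ge (3d-1)/4 - 3d/r$ and the LP value of at most $d/2$.

```latex
\[
\frac{T(d)}{d/2}
\;\ge\; \frac{2}{d}\left(\frac{3d-1}{4} - \frac{3d}{r}\right)
\;=\; \frac{3}{2} - \frac{1}{2d} - \frac{6}{r}
\;\ge\; \frac{3}{2} - \frac{1}{2d} - \frac{6}{d}
\;\ge\; \frac{3}{2} - \frac{8}{d}.
\]
```

Choosing d on the order of $1/\epsilon$ then gives the $3/2 - \epsilon$ bound of Theorem 2.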



4 Conclusion 

We presented a simple (1 + 1/k)-approximation algorithm based on the rounding by sampling method for the
minimum size k-arc connected subgraph problem. Unlike recent applications of the rounding by sampling
method [AGM+10, OSS11], our algorithm has a flavor of the iterated rounding method [Jai01] in its particular
use of the extreme point solutions. The main open problem is to find a better than factor 2 approximation
algorithm for the minimum cost strongly connected subgraph problem.

We also showed that the integrality gap of the minimum cost strongly connected subgraph problem is at
least $1.5 - \epsilon$, for any $\epsilon > 0$. This leaves an interesting open question whether the lower bound of $1 + \Omega(1/k)$
is achievable for the minimum size k-arc connected subgraph problem as well.

Acknowledgments: We thank Joseph Cheriyan for useful discussions on the preliminary construction of 
the integrality-gap instance. 